(54) Title: HYDROLASE BINDING ASSAY 



(57) Abstract 



Disclosed is a binding assay for proteases and phosphatases that contain cysteine in their binding sites or as a necessary structural
component for enzymatic binding. The sulfhydryl group of cysteine is the nucleophilic group responsible for the enzyme's proteolytic and
hydrolytic activity. The assay can be used to determine the ability of new, unknown ligands and mixtures of compounds to competitively
bind with the enzyme versus a known binding agent for the enzyme, e.g., a known enzyme inhibitor. By the use of a mutant form of the
natural or native wild-type enzyme, in which serine, or another amino acid, e.g., alanine, replaces cysteine, the problem of interference
from extraneous oxidizing and alkylating agents in the assay procedure is overcome. The interference arises because of oxidation or
alkylation of the sulfhydryl, -SH (or -S⁻), in the cysteine, which then adversely affects the binding ability of the enzyme. Specifically
disclosed is an assay for tyrosine phosphatases and cysteine proteases, including capsases and cathepsins, e.g., Cathepsin K (O2), utilizing
scintillation proximity assay (SPA) technology. The assay has important applications in the discovery of compounds for the treatment and
study of, for example, diabetes, immunosuppression, cancer, Alzheimer's disease and osteoporosis. The novel feature of the use of a mutant
enzyme can be extended to its use in a wide variety of conventional colorimetric, photometric, spectrophotometric, radioimmunoassay and
ligand binding competitive assays.

TITLE OF THE INVENTION 

HYDROLASE BINDING ASSAY 



FIELD OF THE INVENTION 

This invention relates to the use of mutant phosphatase
and protease enzymes in a competitive binding assay. Specific
examples are the enzymes tyrosine phosphatase and cysteine
protease, e.g., Cathepsin K, and the assay specifically described is a
scintillation proximity assay using a radioactive inhibitor to induce
scintillation.

BACKGROUND OF THE INVENTION 

The use of the scintillation proximity assay (SPA) to
study enzyme binding and interactions is a relatively recent type of
radioimmunoassay that is well known in the art. The advantage of
SPA technology over more conventional radioimmunoassay or
ligand-binding assays is that it eliminates the need to separate
unbound ligand from bound ligand prior to ligand measurement. See,
for example, Nature, Vol. 341, pp. 167-178, entitled "Scintillation
Proximity Assay" by N. Bosworth and P. Towers; Anal. Biochem.
Vol. 217, pp. 139-147 (1994), entitled "Biotinylated and Cysteine-Modified
Peptides as Useful Reagents For Studying the Inhibition of
Cathepsin G" by A. M. Brown et al.; Anal. Biochem. Vol. 223, pp. 259-265
(1994), entitled "Direct Measurement of the Binding of RAS to
Neurofibromin Using Scintillation Proximity Assay" by R. H.
Skinner et al.; and Anal. Biochem. Vol. 230, pp. 101-107 (1995), entitled
"Scintillation Proximity Assay to Measure Binding of Soluble
Fibronectin to Antibody-Captured alpha5beta1 Integrin" by J. A.
Pachter et al.

The basic principle of the assay lies in the use of a solid
support containing a scintillation agent, wherein a target enzyme is
attached to the support through, e.g., a second enzyme-antienzyme
linkage. A known tritiated or I-125 iodinated binding agent, i.e., a
radioligand inhibitor for the target enzyme, is utilized as a
control which, when bound to the active site in the target enzyme, is
in close enough proximity to the scintillation agent to induce a scintillation
signal, e.g., photon emission, which can be measured by
conventional scintillation/radiographic techniques. The unbound
tritiated (hot) ligand is too far removed from the scintillation agent to
cause an interfering measurable scintillation signal and therefore
does not need to be separated, e.g., by filtration, as in conventional
ligand-binding assays.

The binding of an unknown or potential new ligand
(cold, being non-radioactive) can then be determined in a competitive
assay versus the known radioligand, by measuring the resulting
change in the scintillation signal, which will significantly decrease
when the unknown ligand also possesses good binding properties.

However, a problem arises when utilizing a target
enzyme containing a cysteine group having a free thiol group,
-SH (or present as -S⁻), which is in the active site region or is closely
associated with the active site and is important for enzyme-ligand
binding. If the unknown ligand or mixture, e.g., natural product
extracts, human body fluids, cellular fluids, etc., contains reagents
which can alkylate, oxidize or chemically interfere with the cysteine
thiol group such that normal enzyme-ligand binding is disrupted,
then false readings will occur in the assay.

What is needed in the art is a method to circumvent and avoid
the problem of cysteine interference in the scintillation proximity assay (SPA)
procedure in enzyme binding studies.

SUMMARY OF THE INVENTION 

We have discovered that by substituting serine for
cysteine in a target enzyme, where the cysteine plays an active role in
the wild-type enzyme-natural ligand binding process, usually as the
catalytic nucleophile in the active binding site, a mutant is formed
which can be successfully employed in a scintillation proximity assay
without any active site cysteine interference.

This discovery can be utilized for any enzyme which
contains cysteine groups important or essential for binding and/or
catalytic activity as proteases or hydrolases and includes
phosphatases, e.g., tyrosine phosphatases, and proteases, e.g.,
cysteine proteases, including the cathepsins, e.g., Cathepsin K (O2),
and the capsases.

Further, the mutant enzyme is not limited to use in the
scintillation proximity assay, but can be used in a wide variety of
known assays including colorimetric, spectrophotometric and ligand-binding
assays, radioimmunoassays and the like.

We have furthermore discovered a new method of
amplifying the effect of a binding agent ligand, e.g., a radioactive
inhibitor, useful in the assay by replacing two or more
phosphotyrosine residues with 4-phosphono(difluoromethyl)
phenylalanine (F2Pmp) moieties. The resulting inhibitor exhibits a
greater and more hydrolytically stable binding affinity for the target
enzyme and a stronger scintillation signal.
By this invention there is provided a process for
determining the binding ability of a ligand to a cysteine-containing
wild-type enzyme comprising the steps of:

(a) contacting a complex with the ligand, the complex
comprising a mutant form of the wild-type enzyme,
in which cysteine, at the active site, is replaced
with serine, in the presence of a known binding
agent for the mutant enzyme, wherein the binding
agent is capable of binding with the mutant
enzyme to produce a measurable signal.

Further provided is a process for determining the
binding ability of a ligand, preferably a non-radioactive (cold) ligand,
to an active site cysteine-containing wild-type tyrosine phosphatase
comprising the steps of:

(a) contacting a complex with the ligand, the complex
comprising a mutant form of the wild-type enzyme,
the mutant enzyme being PTP1B, containing the
same amino acid sequence 1-320 as the wild-type
enzyme, except at position 215, in which cysteine is
replaced with serine in the mutant enzyme, in the
presence of a known radioligand binding agent for
the mutant enzyme, wherein the binding agent is
capable of binding with the mutant enzyme to
produce a measurable beta radiation-induced
scintillation signal.

Also provided is a new class of peptide binding agents selected
from the group consisting of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where
E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide; and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
and their tritiated and I-125 iodinated derivatives.

Further provided is a novel tritiated peptide, tritiated
BzN-EJJ-CONH2, being N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide,
wherein E, as used herein, is glutamic acid and J, as used herein,
is the (F2Pmp) moiety, (4-phosphono(difluoromethyl)-phenylalanyl).

Furthermore there is provided a process for increasing
the binding affinity of a ligand for a tyrosine phosphatase or cysteine
protease comprising introducing into the ligand two or more
4-phosphono(difluoromethyl)-phenylalanine groups; also provided is
the resulting disubstituted ligand.

In addition there is provided a complex comprised of:

(a) a mutant form of a wild-type enzyme, in which
cysteine, necessary for activity in the active site, is
replaced with serine and is attached to:

(b) a solid support.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 illustrates the main elements of the invention,
including the scintillation agent 1, the supporting (fluomicrosphere)
bead 5, the surface binding Protein A 10, the linking anti-GST
enzyme 15, the fused enzyme construct 20, the GST enzyme 25, the
mutant enzyme 30, the tritiated peptide inhibitor 35, the beta
radiation emission 40 from the radioactive peptide inhibitor 35 and
the emitted light 45 from the induced scintillation.

FIGURE 2 (A and B) illustrates the DNA and amino acid
sequences for PTP1B tyrosine phosphatase enzyme, truncated to
amino acid positions 1-320. (Active site cysteine at position 215 is in
bold and underlined).

FIGURE 3 (A, B and C) illustrates the DNA and amino
acid sequences for Cathepsin K. The upper nucleotide sequence
represents the cathepsin K cDNA sequence which encodes the
cathepsin K preproenzyme (indicated by the corresponding three
letter amino acid codes). Numbering indicates the cDNA nucleotide
position. The underlined amino acid is the active site Cys139 residue
that was mutated to either Ser or Ala.

FIGURE 4 (A and B) illustrates the DNA and amino acid
sequences for the capsase, apopain. The upper nucleotide sequence
represents the apopain (CPP32) cDNA sequence which encodes the
apopain proenzyme (indicated by the corresponding three letter
amino acid codes). Numbering indicates the cDNA nucleotide
position. The underlined amino acid is the active site Cys163 residue
that was mutated to Ser.

DETAILED DESCRIPTION OF THE INVENTION 

The theory underlying the main embodiment of the
invention can be readily seen and understood by reference to
FIGURE 1.

Scintillation agent 1 is incorporated into small (yttrium
silicate or PVT fluomicrospheres, AMERSHAM) beads 5 that
contain on their surface immunosorbent protein A 10. The protein A
coated bead 5 binds the GST fused enzyme construct 20, containing
GST enzyme 25 and PTP1B mutant enzyme 30, via anti-GST enzyme
antibody 15. When the radioactive, e.g., tritiated, peptide 35 is bound
to the mutant phosphatase enzyme 30, it is in close enough proximity
to the bead 5 for its beta emission 40 (or Auger electron emission in
the case of I-125) to stimulate the scintillation agent 1 to emit light
(photon emission) 45. This light 45 is measured as counts in a beta
plate counter. When the tritiated peptide 35 is unbound, it is too
distant from the scintillation agent 1 and the energy is dissipated
before reaching the bead 5, resulting in low measured counts.
Non-radioactive ligands which compete with the tritiated peptide 35 for the
same binding site on the mutant phosphatase enzyme 30 will remove
and/or replace the tritiated peptide 35 from the mutant enzyme 30,
resulting in lower counts than the uncompeted peptide control. By
varying the concentration of the unknown ligand and measuring the
resulting lower counts, the concentration giving 50% inhibition (IC50)
of ligand binding to the mutant enzyme 30 can be obtained. This then
is a measure of the binding ability of the ligand to the mutant enzyme
and the wild-type enzyme.
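
As an illustration only, the competition readout described above can be reduced to a short calculation. The following Python sketch converts raw counts to percent of the uncompeted control and estimates an IC50 by interpolating on a log concentration axis. The function names, the background-subtraction step and all numerical data are hypothetical and are not taken from this specification; a fitted dose-response model could equally be used.

```python
import math


def percent_of_control(counts, control_counts, background_counts=0.0):
    """Express measured counts as a percentage of the uncompeted control."""
    return 100.0 * (counts - background_counts) / (control_counts - background_counts)


def estimate_ic50(concentrations, percents):
    """Interpolate the concentration giving 50% of control on a log10 axis.

    concentrations: positive values in ascending order.
    percents: matching percent-of-control values.
    Returns None if the curve never crosses 50%.
    """
    points = list(zip(concentrations, percents))
    for (c_lo, p_lo), (c_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None


# Hypothetical example data (counts per minute); not measured values.
control = 10000.0      # radioligand alone, no competing ligand
background = 200.0     # assumed non-specific signal
conc_nM = [1.0, 10.0, 100.0, 1000.0]
counts = [9020.0, 7550.0, 2650.0, 690.0]

percents = [percent_of_control(c, control, background) for c in counts]
ic50 = estimate_ic50(conc_nM, percents)
print(f"estimated IC50: {ic50:.1f} nM")
```

Interpolating on the log axis is chosen here because test concentrations are usually spaced by constant factors; it is a simplification and not a statement of how the values in this specification were obtained.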

The term "complex" as used herein refers to the
assembly containing the mutant enzyme. In its simplest
embodiment, the complex is a solid support with the mutant enzyme
attached to the surface of the support. A linker can also be employed.
As illustrated in FIGURE 1, the complex can further comprise a bead
(fluopolymer), anti-enzyme GST/enzyme GST-mutant enzyme-PTP1
linking construct, immunosorbent protein A, and scintillation agent.

In general, the complex requires a solid support (beads,
immunoassay column of, e.g., Al2O3, or silica gel) to which the
mutant enzyme can be anchored or tethered by attachment through a
suitable linker, e.g., an immunosorbent (e.g., Protein A, Protein G,
anti-mouse, anti-rabbit, anti-sheep) and a linking assembly,
including an enzyme/anti-enzyme construct attached to the solid
support.

The term "cysteine-containing wild-type enzyme", as
used herein, includes all native or natural enzymes, e.g.,
phosphatases, cysteine proteases, which contain cysteine in the
active site as the active nucleophile, or contain cysteine closely
associated with the active site that is important in binding activity.

The term "binding agent" as used herein includes all
ligands (compounds) which are known to be able to bind with the
wild-type enzyme and usually act as enzyme inhibitors. The binding
agent carries a signal-producing agent, e.g., a radionuclide, to initiate
the measurable signal. In the SPA assay the binding agent is a
radioligand.

The term "measurable signal" as used herein includes
any type of generated signal, e.g., radioactive, colorimetric,
photometric, spectrophotometric, scintillation, which is produced
upon binding of the radioligand binding agent to the mutant enzyme.

The present invention assay further overcomes problems
encountered in the past, where compounds were evaluated by their
ability to affect the reaction rate of the enzyme in the phosphatase
activity assay. However, this did not give direct evidence that
compounds were actually binding at the active site of the enzyme.
The herein described invention binding assay using a substrate
analog can determine directly whether the mixtures of natural
products can irreversibly modify the active site cysteine in the target
enzyme, resulting in inhibition of the enzymatic activity. To overcome
inhibition by these contaminants in the phosphatase assay, a mutated
Cys(215) to Ser(215) form of the tyrosine phosphatase PTP1B was
cloned and expressed, resulting in a catalytically inactive enzyme. In
general, replacement of cysteine by serine will lead to a catalytically
inactive or substantially reduced activity mutant enzyme.

PTP1B is the first protein tyrosine phosphatase to be
purified to near homogeneity {Tonks et al., JBC 263, 6731-6737 (1988)}
and sequenced by Charbonneau et al., PNAS 85, 7182-7186 (1988). The
sequence of the enzyme showed substantial homology to a duplicated
domain of an abundant protein present in hematopoietic cells
variously referred to as LCA or CD45. This protein was shown to
possess tyrosine phosphatase activity {Tonks et al., Biochemistry 27,
8695-8701 (1988)}. Protein tyrosine phosphatases have been known to
be sensitive to thiol oxidizing agents, and alignment of the sequence of
PTP1B with subsequently cloned Drosophila and mammalian
tyrosine phosphatases pointed to the conservation of a cysteine
residue {M. Streuli et al., Proc. Natl. Acad. Sci. USA, Vol. 86, pp. 8698-8702
(1989)} which, when mutated to Ser, inactivated the catalytic activity of
the enzymes. Guan et al. (1991) {J.B.C. Vol. 266, 17026-17030, 1991}
cloned the rat homologue of PTP1B, expressed a truncated version of
the protein in bacteria, purified it and showed that the Cys at position 215 is
the active site residue. Mutation of the Cys215 to Ser215 resulted in
loss of catalytic activity. Human PTP1B was cloned by Chernoff et al.,
Proc. Natl. Acad. Sci. USA 87, 2735-2739 (1990).

Work leading up to the development of the substrate
analog BzN-EJJ-CONH2 for PTP1B was published by T. Burke et al.,
Biochem. Biophys. Res. Comm. 204, pp. 129-134 (1994), with the
synthesis of the hexamer peptide containing the phosphotyrosyl
mimetic F2Pmp. We have incorporated the (F2Pmp) moiety
(4-phosphono-(difluoromethyl)phenylalanyl) into various peptides that
led to the discovery of BzN-EJJ-CONH2 (where E is glutamic acid
and J as used herein is the F2Pmp moiety), an active (5 nM) inhibitor
of PTP1B. This was subsequently tritiated, giving the radioactive
substrate analog required for the binding assay.

The mutated enzyme, as the truncated version
containing amino acids 1-320 (see FIGURE 2), has been demonstrated
for the first time to bind the substrate analog BzN-EJJ-CONH2 with
high affinity. The mutated enzyme is less sensitive to oxidizing
agents than the wild-type enzyme and provides an opportunity to
identify novel inhibitors for this family of enzymes. The use of a
mutated enzyme to eliminate interfering contaminants during drug
screening is not restricted to the tyrosine phosphatases and can be
used for other enzyme binding assays as well.

Other binding assays exist in the art in which the basic
principle of this invention can be utilized, namely, using a mutant
enzyme in which a reactive cysteine important for
activity is modified to serine (or a less reactive amino acid),
rendering the enzyme more stable to cysteine modifying reagents, such
as alkylating and oxidizing agents. These other ligand-binding
assays include, for example, colorimetric and spectrophotometric
assays, e.g., measurement of produced color or fluorescence,
phosphorescence (e.g., ELISA, solid absorbent assays) and other
radioimmunoassays in which short or long wave light radiation is
produced, including ultraviolet and gamma radiation.

Further, the scintillation proximity assay can also be
practiced without the fluopolymer support beads (AMERSHAM)
illustrated in FIGURE 1. For example, Scintistrips® are
commercially available (Wallac Oy, Finland) and can also be
employed as the scintillant-containing solid support for the mutant
enzyme complex, as can other solid supports which are
conventional in the art.

The invention assay described herein is applicable to a
variety of cysteine-containing enzymes including protein
phosphatases, proteases, lipases, hydrolases, and the like.

The cysteine to serine transformation in the target
enzyme can readily be accomplished by analogous use of the
molecular cloning technique for Cys215 to Ser215 described in the
below-cited reference by M. Streuli et al. for PTP1B, which is hereby
incorporated by reference for this particular purpose.






WO 98/20156 



PCT/CA97/00825 



A particularly useful class of phosphatases is the
tyrosine phosphatases since they are important in cell function.
Examples of this class are: PTP1B, LCA, LAR, DLAR, DPTP (see
Streuli et al., below). Ligands discovered by this assay using, for
example, PTP1B can be useful, for example, in the treatment of
diabetes and immunosuppression.

A useful species is PTP1B, described in Proc. Natl. Acad. Sci.
USA, Vol. 86, pp. 8698-8702 by M. Streuli et al. and Proc. Natl. Acad.
Sci. USA, Vol. 87, pp. 2735-2739 by J. Chernoff et al.

Another useful class of enzymes is the proteases,
including cysteine proteases (thiol proteases), cathepsins and
capsases.

The cathepsin class of cysteine proteases is important
since Cathepsin K (also termed Cathepsin O2, see Biol. Chem.
Hoppe-Seyler, Vol. 376, pp. 379-384, June 1995, by D. Bromme et al.) is
primarily expressed in human osteoclasts and therefore this
invention assay is useful in the study and treatment of osteoporosis.
See US Patent 5,501,969 (1996) to Human Genome Sciences for the
sequence, cloning and isolation of Cathepsin K (O2). See also J. Biol.
Chem. Vol. 271, No. 21, pp. 12511-12516 (1996) by F. Drake et al. and
Biol. Chem. Hoppe-Seyler, Vol. 376, pp. 379-384 (1995) by D. Bromme et
al., supra.

Examples of the cathepsins include Cathepsin B,
Cathepsin G, Cathepsin J, Cathepsin K (O2), Cathepsin L, Cathepsin
M and Cathepsin S.

The capsase family of cysteine proteases is another
example where the SPA technology and the use of mutated enzymes
can be used to determine the ability of unknown compounds and
mixtures of compounds to compete with a radioactive inhibitor of the
enzyme. An active site mutant of Human Apopain CPP32 (capsase-3)
has been prepared. The active site thiol mutated enzymes are less
sensitive to oxidizing agents and provide an opportunity to identify
novel inhibitors for this family of enzymes.

Examples of the capsase family include: capsase-1 (ICE),
capsase-2 (ICH-1), capsase-3 (CPP32, human apopain, Yama),
capsase-4 (ICErel-II, TX, ICH-2), capsase-5 (ICErel-III, TY),
capsase-6 (Mch2), capsase-7 (Mch3, ICE-LAP3, CMH-1), capsase-8 (FLICE,
MACH, Mch5), capsase-9 (ICE-LAP6, Mch6) and capsase-10 (Mch4).

Substitution of the cysteine by serine (or by any other
amino acid which lowers the reactivity toward oxidizing and alkylating
agents, e.g., alanine) does not abolish the ability of the mutant
enzyme to bind natural ligands. The degree of binding, i.e., the binding
constant, may be increased or decreased. The catalytic activity of the
mutant enzyme will, however, be substantially decreased or even
completely eliminated. Thus, natural and synthetic ligands which
bind to the natural wild-type enzyme will also bind to the mutant
enzyme.

Substitution by serine for cysteine also leads to a
mutant enzyme which has the same quantitative binding ability as
the natural enzyme but is significantly reduced in catalytic
activity. Thus, this invention assay is actually measuring the true
binding ability of the test ligand.

The test ligand described herein is a new ligand
potentially useful for drug screening purposes, and its mode of action
is generally to function as an inhibitor of the enzyme.

The binding agent usually is a known ligand used as a
control and is capable of binding to the natural wild-type enzyme and
the mutant enzyme employed in the assay, and is usually chosen as a
known peptide inhibitor for the enzyme.

The binding agent also contains a known signal-producing
agent to cause or induce the signal in the assay, which can be
an agent inducing, e.g., phosphorescence or fluorescence (ELISA), a
color reaction or a scintillation signal.

In the instant embodiment, where the assay is a
scintillation assay, the signal agent is a radionuclide, e.g., tritium or
I-125, which induces the scintillant in the solid support to emit
measurable light radiation, i.e., photon emission, which can be
measured by using conventional scintillation and beta radiation
counters.

We have also discovered that introducing two or more
4-phosphonodifluoromethyl phenylalanine (F2Pmp) groups into a
known binding agent greatly enhances the binding affinity of the
binding agent to the enzyme and improves its stability by rendering
the resulting complex less susceptible to hydrolytic cleavage.

A method for introducing one F2Pmp moiety into a
ligand is known in the art and is described in detail in Biochem.
Biophys. Res. Comm. Vol. 204, pp. 129-134 (1994), hereby incorporated
by reference for this particular purpose.

As a result of this technology we discovered a new class
of ligands having extremely good binding affinity for PTP1B. These
include:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide.

A useful ligand in the series is BzN-EJJ-CONH2, whose chemical
name is: N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide,
and its tritiated form, N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide.

Synthesis of both cold and hot ligands is described in the
Examples.

The following Examples are illustrative of carrying out
the invention and should not be construed as being limitations on the
scope or spirit of the instant invention.

EXAMPLES 

1. Preparation of PTP1B Truncate (Amino Acid Sequence from 1-320) 
and Fused GST-PTP1B Construct 

An E. coli culture carrying a PET plasmid expressing
the full length PTP1B protein was disclosed in J. Chernoff et al., Proc.
Natl. Acad. Sci. USA, 87, pp. 2735-2739 (1990). This was modified to
a truncated PTP1B enzyme complex containing the active site with
amino acids 1-320 inclusive, by the following procedure:

The full length human PTP-1B cDNA sequence
(published in J. Chernoff et al., PNAS, USA, supra) cloned
into a PET vector was obtained from Dr. Raymond Erickson (Harvard
University). The PTP-1B cDNA sequence encoding amino acids 1-320
(Seq. ID No. 1) was amplified by PCR using the full length sequence
as template. The 5' primer used for the amplification included a
Bam HI site at the 5' end and the 3' primer had an Eco RI site at the
3' end. The amplified fragment was cloned into pCR2 (Invitrogen)
and sequenced to ensure that no sequence errors had been introduced
by Taq polymerase during the amplification. This sequence was
released from pCR2 by a Bam HI/Eco RI digest and the PTP-1B cDNA
fragment ligated into the GST fusion vector pGEX-2T (Pharmacia)
that had been digested with the same enzymes. The GST-PTP-1B
fusion protein expressed in E. coli has an active protein tyrosine
phosphatase activity. This same 1-320 PTP-1B sequence (Seq. ID No.
1) was then cloned into the expression vector pFLAG-2, where FLAG
is the octa-peptide AspTyrLysAspAspAspAspLys. This was done by
releasing the PTP-1B sequence from the pGEX-2T vector by Nco I/Eco
RI digest, filling in the ends of this fragment by Klenow and
blunt-end ligating into the blunted Eco RI site of pFLAG2. Site-directed
mutagenesis was performed on the pFLAG2-PTP-1B plasmid using the
Chameleon double-stranded mutagenesis kit from
Stratagene, to replace the active-site Cys-215 with serine. The
mutagenesis was carried out essentially as described by the
manufacturer and mutants were identified by DNA sequencing. The
FLAG-PTP-1B Cys215Ser mutant (Seq. ID No. 7) was expressed,
purified and found not to have any phosphatase activity. The
GST-PTP-1B Cys215Ser mutant was made using the mutated Cys215Ser
sequence of PTP-1B already cloned into pFLAG2, as follows. The
pFLAG2-PTP-1B Cys215Ser plasmid (Seq. ID No. 7) was digested
with Sal I (3' end of PTP-1B sequence), filled in using Klenow
polymerase (New England Biolabs), the enzymes were heat
inactivated and the DNA redigested with Bgl II. The 500 bp 3' PTP-1B
cDNA fragment which is released and contains the mutated active
site was recovered. The pGEX-2T-PTP-1B plasmid was digested with
Eco RI (3' end of PTP-1B sequence), filled in by Klenow,
phenol/chloroform extracted and ethanol precipitated. This DNA
was then digested with Bgl II, producing two DNA fragments: a 500
bp 3' PTP-1B cDNA fragment that contains the active site and a 5.5 Kb
fragment containing the pGEX-2T vector plus the 5' end of PTP-1B.
The 5.5 Kb pGEX-2T 5' PTP-1B fragment was recovered and ligated
with the 500 bp Bgl II/Sal I fragment containing the mutated active
site. The ligation was transformed into bacteria (type DH5α, G) and
clones containing the mutated active site sequence were identified by
sequencing. The GST-PTP-1B Cys215Ser mutant was overexpressed,
purified and found not to have any phosphatase activity.
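
Purely by way of illustration, the sequence-level outcome of the Cys215 to Ser mutagenesis (and the check that sequencing is meant to confirm) can be sketched in Python as follows. The helper name and the placeholder sequence are hypothetical; the placeholder is NOT the PTP1B sequence of Seq. ID No. 1 and merely has the same length, 320 residues, with a cysteine at position 215.

```python
def substitute_residue(protein, position, expected, replacement):
    """Return a copy of `protein` with the residue at 1-based `position` replaced.

    Raises ValueError if the position is out of range or if the residue found
    there is not `expected`, guarding against off-by-one numbering errors.
    """
    index = position - 1
    if not 0 <= index < len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


# Placeholder one-letter sequence, 320 residues long (assumption for the demo).
wild_type = "A" * 214 + "C" + "G" * 105
mutant = substitute_residue(wild_type, 215, "C", "S")

# Confirm that exactly one residue differs, as sequencing of a clone would.
differences = [
    (i + 1, a, b) for i, (a, b) in enumerate(zip(wild_type, mutant)) if a != b
]
print(differences)
```

Checking the expected residue before substituting mirrors the practical need to confirm numbering against the reference sequence before designing a mutagenic primer.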

20 

2. Preparation of Tritiated BzN-EJJ-CONH2

This compound can be prepared as outlined in Scheme 1,
below, and by following the procedures:

Synthesis of N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide
(BzN-EJJ-CONH2)

1.0 g of TentaGel® S RAM resin (RAPP polymer, ~0.2
mmol/g), as represented by the shaded bead in Scheme 1, was treated
with piperidine (3 mL) in DMF (5 mL) for 30 min. The resin
(symbolized by the circular P, containing the remainder of the
organic molecule except the amino group) was washed successively
with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried. A solution
of DMF (5 mL), Nα-Fmoc-4-[diethylphosphono-(difluoromethyl)]-L-phenylalanine
(350 mg), where Fmoc is 9-fluorenylmethoxycarbonyl,
and O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (acronym being HATU, 228 mg) was treated
with diisopropylethylamine (0.21 mL) and, after 15 min., was added
to the resin in 3 mL of DMF. After 1 h, the resin was washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.

The sequence was repeated two times, first using
Nα-Fmoc-4-[diethylphosphono-(difluoromethyl)]-L-phenylalanine and then
using N-Fmoc-L-glutamic acid gamma-t-butyl ester. After the final
coupling, the resin bound tripeptide was treated with a mixture of
piperidine (3 mL) in DMF (5 mL) for 30 min. and was then washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.
To a solution of benzoic acid (61 mg) and HATU (190 mg)
in DMF (1 mL) was added diisopropylethylamine (0.17 mL) and, after
15 min., the mixture was added to a portion of the resin prepared
above (290 mg) in 1 mL DMF. After 90 min. the resin was washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.
The resin was treated with 2 mL of a mixture of TFA:water (9:1) and
0.05 mL of triisopropylsilane (TIPS-H) for 1 h. The resin was filtered
off and the filtrate was diluted with water (2 mL) and concentrated in
vacuo at 35°C. The residue was treated with 2.5 mL of a mixture of
TFA:DMS:TMSOTf (5:3:1) and 0.05 mL of TIPS-H, and stirred at 25°C
for 15 h. (TFA is trifluoroacetic acid, DMS is dimethyl sulfide,
TMSOTf is trimethylsilyl trifluoromethanesulfonate).

The desired tripeptide, the title compound, was purified 
by reverse phase HPLC (C18 column, 25 x 100 mm) using a mobile 
phase gradient from 0.2% TFA in water to 50/50 acetonitrile/0.2% 
TFA in water over 40 min. and monitoring at 230 nm. The fraction 
eluting at approximately 14.3 min. was collected, concentrated and 
lyophilized to yield the title compound as a white foam. 

Synthesis of N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide 

The above procedure described for the preparation of 
BzN-EJJ-CONH2 was repeated, but substituting 3,5-dibromobenzoic 
acid for benzoic acid. After HPLC purification as before, except using 
a gradient over 30 min. and collecting the fraction at approximately 
18.3 min., the dibromo-containing tripeptide was obtained as a white 
foam. 
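
To give a feel for where the two peptides elute, the nominal mobile-phase composition at a given retention time can be estimated. This is only a sketch: it assumes a linear gradient from 0% to 50% acetonitrile, assumes the 30 min. run uses the same endpoints as the 40 min. run, and ignores column dead time and instrument dwell volume, none of which are specified in the text. The function name is hypothetical.

```python
def percent_acetonitrile(t_min: float, gradient_min: float,
                         start_pct: float = 0.0, end_pct: float = 50.0) -> float:
    """Nominal % acetonitrile at time t for a linear gradient (no dwell correction)."""
    if not 0.0 <= t_min <= gradient_min:
        raise ValueError("time must lie within the gradient")
    return start_pct + (end_pct - start_pct) * t_min / gradient_min


# Benzoyl tripeptide: fraction at about 14.3 min of a 40 min gradient.
benzoyl_pct = percent_acetonitrile(14.3, 40.0)
# Dibromobenzoyl tripeptide: fraction at about 18.3 min of a 30 min gradient.
dibromo_pct = percent_acetonitrile(18.3, 30.0)

print(f"benzoyl tripeptide: ~{benzoyl_pct:.1f}% acetonitrile")
print(f"dibromo tripeptide: ~{dibromo_pct:.1f}% acetonitrile")
```

On these assumptions the dibromo compound elutes at a noticeably higher organic fraction, as would be expected for the more hydrophobic analogue.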

A portion of this material (2 mg) was dissolved in 
methanol/triethylamine (0.5 mL, 4/1), 10% Pd-C (2 mg) was added, and the 
mixture stirred under an atmosphere of tritium gas for 24 h. The mixture was 
filtered through celite, washing with methanol, and the filtrate was 
concentrated. The title compound was obtained after purification by 
semi-preparative HPLC using a C18 column and an isocratic mobile phase of 
acetonitrile/0.2% TFA in water (15:100). The fraction eluting at approximately 
5 min. was collected and concentrated in vacuo. The title compound was 
dissolved in 10 mL of methanol/water (9:1) to provide a 0.1 mg/mL solution of 
specific activity 39.4 Ci/mmol. 
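
The stock solution figures can be converted into a molar concentration and a total radioactivity. The sketch below uses the molecular weight of 808 quoted for the tritiated peptide in the assay materials; the variable names are illustrative only.

```python
MW_G_PER_MOL = 808.0                   # molecular weight quoted for the tritiated peptide
CONC_MG_PER_ML = 0.1                   # stock concentration
VOLUME_ML = 10.0                       # stock volume
SPECIFIC_ACTIVITY_CI_PER_MMOL = 39.4   # specific activity

# 0.1 mg/mL equals 0.1 g/L; dividing by g/mol gives mol/L, then scale to micromolar.
molar_conc_um = CONC_MG_PER_ML / MW_G_PER_MOL * 1e6

# Total amount in mmol (mg divided by g/mol) and total activity in mCi.
total_mmol = CONC_MG_PER_ML * VOLUME_ML / MW_G_PER_MOL
total_mci = total_mmol * SPECIFIC_ACTIVITY_CI_PER_MMOL * 1000.0

# Fold dilution needed to reach the 50 nM working solution used in the assay.
fold_dilution_to_50_nm = molar_conc_um * 1000.0 / 50.0

print(f"stock concentration: {molar_conc_um:.1f} uM")
print(f"total activity:      {total_mci:.1f} mCi")
print(f"dilution to 50 nM:   1:{fold_dilution_to_50_nm:.0f}")
```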

SCHEME 1 

[Reaction scheme not reproduced. Recoverable labels: TentaGel® S RAM 
polymer; Fmoc-protected amino acid (HO2C ... NHFmoc) bearing a 
CF2-PO(OEt)2 group; coupling cycles of 1. HATU, (i-Pr)2NEt, DMF and 
2. piperidine, DMF; then 1. TFA-H2O (9:1); 2. TFA-DMS-TMSOTf-TIPSH; 
3. HPLC purification; 4. for X = Br: T2 (g), 10% Pd-C, MeOH, Et3N; 
HPLC purification; product bearing CF2-PO(OH)2 groups, X = H or T.] 


By following the above described procedure for 
BzN-EJJ-CONH2, the following other peptide inhibitors were also similarly 
prepared: 

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, 
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, 
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, 
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, 
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, 
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide, and 
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide. 

4. Phosphatase Assay Protocol 

Materials: 

EDTA - ethylenediaminetetraacetic acid (Sigma) 
DMH - N,N'-dimethyl-N,N'-bis(mercaptoacetyl)hydrazine (synthesis 
published in J. Org. Chem. 56, pp. 2332-2337 (1991) by R. Singh and 
G.M. Whitesides; can be substituted with DTT - dithiothreitol) 
Bistris - 2,2-bis(hydroxymethyl)-2,2',2''-nitrilotriethanol (Sigma) 
Triton X-100 - octylphenolpoly(ethyleneglycolether)10 (Pierce) 
Antibody: Anti-glutathione S-transferase rabbit (H and L) fraction 
(Molecular Probes) 
Enzyme: Human recombinant PTP1B, containing amino acids 1-320, 
(Seq. ID No. 1) fused to GST enzyme (glutathione S-transferase), 
purified by affinity chromatography. Wild type (Seq. ID No. 1) 
contains active site cysteine(215), whereas mutant (Seq. ID No. 7) 
contains active site serine(215). 

Tritiated peptide: Bz-NEJJ-CONH2, Mwt. 808, empirical 
formula, C32H32T2N4O12P2F4 

Stock Solutions 

(10X) Assay Buffer                     500 mM Bistris (Sigma), pH 6.2, 
                                       MW=209.2 
                                       20 mM EDTA (GIBCO/BRL) 
                                       Store at 4°C. 

Prepare fresh daily: 

Assay Buffer (1X)                      50 mM Bistris 
(room temp.)                           2 mM EDTA 
                                       5 mM DMH (MW=208) 

Enzyme Dilution Buffer                 50 mM Bistris 
(keep on ice)                          2 mM EDTA 
                                       5 mM DMH 
                                       20% Glycerol (Sigma) 
                                       0.01 mg/ml Triton X-100 (Pierce) 

Antibody Dilution Buffer               50 mM Bistris 
(keep on ice)                          2 mM EDTA 
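
The buffer recipes above reduce to simple dilution and mass calculations. The following sketch works through them for a hypothetical 50 mL batch of 1X assay buffer and for 1 L of the 10X stock; the batch sizes and function names are assumptions for illustration, while the concentrations and molecular weights are those listed above.

```python
def stock_volume_ml(stock_fold: float, final_fold: float, final_volume_ml: float) -> float:
    """Volume of concentrated stock needed for a given final volume."""
    return final_volume_ml * final_fold / stock_fold


def solute_mass_mg(conc_mm: float, mw: float, volume_ml: float) -> float:
    """Mass in mg: mM x mL gives micromoles; times g/mol gives micrograms."""
    return conc_mm * volume_ml * mw / 1000.0


FINAL_ML = 50.0  # hypothetical batch of 1X assay buffer, prepared fresh

tenx_ml = stock_volume_ml(10.0, 1.0, FINAL_ML)   # 10X assay buffer to dilute
dmh_mg = solute_mass_mg(5.0, 208.0, FINAL_ML)    # DMH for 5 mM final
bistris_g_per_litre_10x = solute_mass_mg(500.0, 209.2, 1000.0) / 1000.0

print(f"10X buffer per {FINAL_ML:.0f} mL of 1X: {tenx_ml:.1f} mL")
print(f"DMH per {FINAL_ML:.0f} mL of 1X: {dmh_mg:.1f} mg")
print(f"Bistris per litre of 10X stock: {bistris_g_per_litre_10x:.1f} g")
```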



IC50 Binding Assay Protocol: 

Compounds (ligands) which potentially inhibit the 
binding of a radioactive ligand to the specific phosphatase are 
screened in a 96-well plate format as follows: 

To each well is added the following solutions @ 25°C in 
the following chronological order: 

1. 110 µl of assay buffer. 
2. 10 µl of 50 nM tritiated BzN-EJJ-CONH2 in assay 
buffer (1X) @ 25°C. 
3. 10 µl of testing compound in DMSO at 10 different 
concentrations in serial dilution (final DMSO, about 5% v/v) in 
duplicate @ 25°C. 
4. 10 µl of 3.75 µg/ml purified human recombinant 
GST-PTP1B in enzyme dilution buffer. 
5. The plate is shaken for 2 minutes. 
6. 10 µl of 0.3 µg/ml anti-glutathione S-transferase 
(anti-GST) rabbit IgG (Molecular Probes) diluted in antibody dilution 
buffer @ 25°C. 
7. The plate is shaken for 2 minutes. 
8. 50 µl of protein A-PVT SPA beads (Amersham) 
@ 25°C. 
9. The plate is shaken for 5 minutes. The binding 
signal is quantified on a Microbeta 96-well plate counter. 
10. The non-specific signal is defined as the enzyme-ligand 
binding in the absence of anti-GST antibody. 
11. 100% binding activity is defined as the enzyme-ligand 
binding in the presence of anti-GST antibody, but in the 
absence of the testing ligands, with the non-specific binding 
subtracted. 
12. Percentage of inhibition is calculated accordingly. 
13. IC50 value is approximated from the non-linear 
regression fit with the 4-parameter/multiple sites equation (described 
in: "Robust Statistics", New York, Wiley, by P.J. Huber (1981)) and 
reported in nM units. 
14. Test ligands (compounds) with larger than 90% 
inhibition at 10 µM are defined as actives. 
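
The well composition and the data reduction in steps 10 to 14 can be sketched as follows. The count values are hypothetical, and the four-parameter logistic function shown is the commonly used form, given here only as an assumption about the shape of the model; the exact equation and fitting procedure of the cited reference are not reproduced, and a real fit would need a nonlinear least-squares routine.

```python
# Volumes added per well, in microlitres, in the order given in the protocol.
WELL_ADDITIONS_UL = {
    "assay buffer": 110.0,
    "tritiated ligand (50 nM)": 10.0,
    "test compound in DMSO": 10.0,
    "GST-PTP1B": 10.0,
    "anti-GST antibody": 10.0,
    "SPA beads": 50.0,
}
TOTAL_UL = sum(WELL_ADDITIONS_UL.values())


def final_concentration(stock: float, added_ul: float, total_ul: float = TOTAL_UL) -> float:
    """Concentration in the well after dilution into the total volume."""
    return stock * added_ul / total_ul


def percent_inhibition(signal: float, total_binding: float, nonspecific: float) -> float:
    """Inhibition relative to the specific binding window (steps 10 to 12)."""
    window = total_binding - nonspecific
    if window <= 0:
        raise ValueError("total binding must exceed non-specific binding")
    return 100.0 * (1.0 - (signal - nonspecific) / window)


def four_parameter_logistic(conc: float, bottom: float, top: float,
                            ic50: float, hill: float) -> float:
    """Remaining binding signal predicted at a given inhibitor concentration."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def is_active(inhibition_at_10_um: float) -> bool:
    """Step 14: actives show larger than 90% inhibition at 10 uM."""
    return inhibition_at_10_um > 90.0


ligand_nm = final_concentration(50.0, 10.0)   # tritiated ligand in the well, nM
dmso_pct = final_concentration(100.0, 10.0)   # DMSO in the well, % v/v
# Hypothetical counts: sample well, total binding, non-specific binding.
inhibition = percent_inhibition(2500.0, 12000.0, 2000.0)

print(f"well volume {TOTAL_UL:.0f} ul, ligand {ligand_nm} nM, DMSO {dmso_pct}% v/v")
print(f"inhibition {inhibition:.1f}%, active: {is_active(inhibition)}")
```

The volumes sum to 200 µl per well, which agrees with the stated final DMSO content of about 5% v/v.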



The following Table I illustrates typical assay results of 
examples of known compounds which competitively inhibit the 
binding of the binding agent, BzN-EJJ-CONH2. 
Preparation of Cathepsin K(O2) Mutant (CAT-K Mutant) 

Cathepsin K is a prominent cysteine protease in human 
osteoclasts and is believed to play a key role in osteoclast-mediated 
bone resorption. Inhibitors of cathepsin K will be useful for the 
treatment of bone disorders (such as osteoporosis) where excessive 
bone resorption occurs. Cathepsin K is synthesized as a dormant 
preproenzyme (Seq. ID No. 4). Both the pre-domain (Met1-Ala15) and 
the prodomain (Leu16-Arg114) must be removed for full catalytic 
activity. The mature form of the protease (Ala115-Met329) contains 
the active site Cys residue (Cys139). 

The mature form of cathepsin K is engineered for 
expression in bacteria and other recombinant systems as a 
Met-Ala115-Met329 construct by PCR-directed template modification of a 
clone that is identified. Epitope-tagged variants are also generated: 
(Met[FLAG]Ala115-Met329 and MetAla115-Met329[FLAG]; where 
FLAG is the octa-peptide AspTyrLysAspAspAspAspLys). For the 
purpose of establishing a binding assay, several other constructs are 
generated including Met[FLAG]Ala115-[Cys139 to Ser139]-Met329 and 
MetAla115-[Cys139 to Ser139]-Met329[FLAG] (where the active site 
Cys is mutated to a Ser residue), and Met[FLAG]Ala115-[Cys139 to 
Ala139]-Met329 and MetAla115-[Cys139 to Ala139]-Met329[FLAG] 
(where the active site Cys is mutated to an Ala residue). In all cases, 
the resulting re-engineered polypeptides can be used in a binding 
assay by tethering the mutated enzymes to SPA beads via specific 
anti-FLAG antibodies that are commercially available (IDI-KODAK). 
Other epitope tags, GST and other fusions can also be used for this 
purpose and binding assay formats other than SPA can also be used. 
Ligands based on the preferred substrate for cathepsin K (e.g. 
Ac-P2-P1, Ac-P2-P1-aldehydes, Ac-P2-P1-ketones; where P1 is an amino 
acid with a hydrophilic side chain, preferably Arg or Lys, and P2 is 
an amino acid with a small hydrophobic side chain, preferably Leu, 
Val or Phe) are suitable in their radiolabeled (tritiated) forms for 
SPA-based binding assays. Similar binding assays can also be 
established for other cathepsin family members. 
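
The residue numbering used for the cathepsin K constructs can be illustrated with a small helper that applies an active-site substitution in preproenzyme numbering. This is a sketch only: the sequence below is a placeholder of the right length with a Cys at position 139, not the real cathepsin K sequence, and the function names are hypothetical.

```python
MATURE_START = 115   # first residue of the mature enzyme (Ala115)
PREPRO_LENGTH = 329  # last residue of the preproenzyme (Met329)
ACTIVE_SITE = 139    # active site Cys in preproenzyme numbering


def substitute(seq: str, position: int, expected: str, replacement: str) -> str:
    """Replace the residue at a 1-based position after checking its identity."""
    idx = position - 1
    if seq[idx] != expected:
        raise ValueError(f"expected {expected} at {position}, found {seq[idx]}")
    return seq[:idx] + replacement + seq[idx + 1:]


def mature_position(position: int, mature_start: int = MATURE_START) -> int:
    """Convert preproenzyme numbering to 1-based numbering within the mature form."""
    return position - mature_start + 1


# Placeholder sequence (NOT the real protein): Gly filler with Cys at 139.
placeholder = "G" * 138 + "C" + "G" * (PREPRO_LENGTH - ACTIVE_SITE)

c139s = substitute(placeholder, ACTIVE_SITE, "C", "S")
c139a = substitute(placeholder, ACTIVE_SITE, "C", "A")
mature_length = PREPRO_LENGTH - MATURE_START + 1

print(len(placeholder), c139s[ACTIVE_SITE - 1], c139a[ACTIVE_SITE - 1])
print("mature length:", mature_length)
print("Cys139 within mature form:", mature_position(ACTIVE_SITE))
```

Checking the expected residue before substituting guards against off-by-one errors when switching between preproenzyme and mature numbering.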
Preparation of Apopain (caspase-3) Mutant 

Apopain is the active form of a cysteine protease 
belonging to the caspase superfamily of ICE/CED-3 like enzymes. It 
is derived from a catalytically dormant proenzyme that contains both 
the 17 kDa large subunit (p17) and 12 kDa small subunit (p12) of the 
catalytically active enzyme within a 32 kDa proenzyme polypeptide 
(p32). Apopain is a key mediator in the effector mechanism of 
apoptotic cell death and modulators of the activity of this enzyme, or 
structurally-related isoforms, will be useful for the therapeutic 
treatment of diseases where inappropriate apoptosis is prominent, 
e.g., Alzheimer's disease. 

The method used for production of apopain involves 
folding of active enzyme from its constituent p17 and p12 subunits 
which are expressed separately in E. coli. The apopain p17 subunit 
(Ser29-Asp175) and p12 subunit (Ser176-His277) are engineered for 
expression as MetSer29-Asp175 and MetSer176-His277 constructs, 
respectively, by PCR-directed template modification. For the purpose 
of establishing a binding assay, several other constructs are 
generated, including a MetSer29-[Cys163 to Ser163]-Asp175 large 
subunit and a Met1-[Cys163 to Ser163]-His277 proenzyme. In the 
former case, the active site Cys residue in the large subunit (p17) is 
replaced with a Ser residue by site-directed mutagenesis. This large 
subunit is then re-folded with the recombinant p12 subunit to 
generate the mature form of the enzyme except with the active site 
Cys mutated to a Ser. In the latter case, the same Cys to Ser 
mutation is made, except that the entire proenzyme is expressed. In 
both cases, the resulting re-engineered polypeptides can be used in a 
binding assay by tethering the mutated enzymes to SPA beads via 
specific antibodies that are generated to recognize apopain (antibodies 
against the prodomain, the large p17 subunit, the small p12 subunit 
and the entire p17:p12 active enzyme have been generated). Epitope 
tags or GST and other fusions could also be used for this purpose and 
binding assay formats other than SPA can also be used. 

Ligands based on the preferred substrate for apopain 
(variants of AspGluValAsp), such as Ac-AspGluValAsp, 
Ac-AspGluValAsp-aldehydes, Ac-AspGluValAsp-ketones are suitable 
in their radiolabeled forms for SPA-based binding assays. Similar 
binding assays can also be established for other caspase family 
members. 
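
The subunit boundaries quoted for apopain can be turned into expression constructs by slicing the proenzyme and prepending an initiator Met. The sketch below uses a placeholder proenzyme of 277 residues with a Cys at position 163; it is not the real CPP32 sequence, and the helper names are hypothetical.

```python
# Residue ranges (1-based, inclusive) in proenzyme numbering.
SUBUNITS = {"p17": (29, 175), "p12": (176, 277)}
ACTIVE_SITE = 163


def segment(seq: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive)."""
    return seq[first - 1:last]


def expression_construct(seq: str, first: int, last: int) -> str:
    """Prepend an initiator Met to the chosen segment, as in MetSer29-Asp175."""
    return "M" + segment(seq, first, last)


# Placeholder proenzyme (NOT the real protein): Gly filler with Cys at 163.
proenzyme = "G" * (ACTIVE_SITE - 1) + "C" + "G" * (277 - ACTIVE_SITE)

constructs = {name: expression_construct(proenzyme, *span)
              for name, span in SUBUNITS.items()}
subunit_lengths = {name: last - first + 1 for name, (first, last) in SUBUNITS.items()}

# 0-based index of the active site within the p17 construct (after the added Met).
p17_first = SUBUNITS["p17"][0]
active_site_index = ACTIVE_SITE - p17_first + 1

print(subunit_lengths)
print("active site residue in p17 construct:", constructs["p17"][active_site_index])
```

The active site falls inside the large subunit only, which is why the Cys to Ser change is made either in the p17 construct or in the full proenzyme.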

DESCRIPTION OF THE SEQUENCE LISTINGS 

SEQ ID NO. 1 is the top sense DNA strand of Figures 2A and 2B 
for the PTP1B tyrosine phosphatase enzyme. 

SEQ ID NO. 2 is the amino acid sequence of Figures 2A and 2B for 
the PTP1B tyrosine phosphatase enzyme. 

SEQ ID NO. 3 is the top sense cDNA strand of Figures 3A, 3B and 
3C for the Cathepsin K preproenzyme. 

SEQ ID NO. 4 is the amino acid sequence of Figures 3A, 3B and 3C 
for the Cathepsin K preproenzyme. 

SEQ ID NO. 5 is the top sense cDNA strand of Figures 4A and 4B 
for the CPP32 apopain proenzyme. 

SEQ ID NO. 6 is the amino acid sequence of Figures 4A and 4B 
for the CPP32 apopain proenzyme. 

SEQ ID NO. 7 is the cDNA sequence of the human PTP-1B 1-320 
Ser mutant. 

SEQ ID NO. 8 is the amino acid sequence of the human 
PTP-1B 1-320 Ser mutant. 

SEQ ID NO. 9 is the cDNA sequence for the apopain C163S mutant. 

SEQ ID NO. 10 is the amino acid sequence for the apopain C163S 
mutant. 

SEQ ID NO. 11 is the large subunit of the heterodimeric amino acid 
sequence for the apopain C163S mutant. 

SEQ ID NO. 12 is the cDNA sequence for the Cathepsin K C139S 
mutant. 

SEQ ID NO. 13 is the cDNA sequence for the Cathepsin K C139A 
mutant. 

SEQ ID NO. 14 is the amino acid sequence for the Cathepsin K 
C139S mutant. 

SEQ ID NO. 15 is the amino acid sequence for the Cathepsin K 
C139A mutant. 
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(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: NORTH, ROBERT J 

(B) REGISTRATION NUMBER: 27,366 

(C) REFERENCE/DOCKET NUMBER: 19824 PCT 

(ix) TELECOMMUNICATION INFORMATION: 
(A) TELEPHONE: 732-594-7262 

(B) TELEFAX: 732-594-4720 
(C) TELEX: 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 963 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 




SUBSTITUTE SHEET (RULE 26) 



WO 98/20156 



PCT/CA97/00825 



m •: k 





se'."en; ~ cr 


g g? IP'T: c-r: 


ATG. G AG AT G ^ - 


AAA^'v^jG A^jT . 


- o.~HJV,rtort , ... 


_ A^> o AT.-. - — 




GAGTGACTTG 


.^HM^H v_ o 


a?Aj GTA -Aj 


AGACGTCAGT 


C AA G AA G A T A 


A i oA.GT.--.TAT 


. _o-_ . rto i. 


T A C A T T 0 TT A 


- ' - '■- rt^uG . ._ 


CTTGC CT AA G 


' j A ; jj AG r^'-L-rv.— l 


G _ ,*v o ■ Lro TG T 


■ . GTl ATG ( . T G 


TG CGC AC AAT 


aotggscasa 


AAAAG AA' j AA 


aaattaacat 


tgatctotga 


agatatoaag 


gaaaacctta 


'.- AAC C C AAGA 


aastcgagag 


GACTTTGGAG 


- — -To .-^-i T - 


accagg :tca 


TCAGGGTCAC 


TCAGCCCGGA 


hacggg ico 


AGGTCTGGAA 




^gctgatacc 




TTGATATGAA 


;aaagtgctg 


AT 2C AG AC AG 


'- 3GACCAGCT 


hgcttotcc 


ATCATGGGGG 


AGTCTTGGGT 


goaggatcag 


CC2CCACCCG 


AGCATATHC 


ICCACCTCCC 



«j A'_ AAgT 1 - G Go AoCTooo _ o'o i _ >_ .nTTTA 1 - ~ n 

CCATGTAGAG TGGGCAAGGT T2CTAAGAAC 120 

GCGTTTGAGC ATAGTCGGAT TAAACTAC AT 180 

TTGATAAAAA TGGAAGAAG2 CCAAAGGAGT 24 0 

AGATGCGGTG ACTTTTGGGA GATGGTGTGG 3H 

AACAGAGTGA TGGAGAAAGG TTCGTTAAAA 3 6 0 

AAAG AGATHA T 3TTTGAA ^A GAC.AAATTTG 420 

TCATATTATA CAGTGCGACA GCTAGAATTG 4 80 

ATCTTACATT T 3 GAG TATA 2 3AGATGGCCT 540 

ACCAGC1TCA TTCTTGAA2T TTCTTTT3AA AGTGGGAGAG 600 

3ACGGG3C0 GTTGTGGTGC A 3TGGAGT3G AGGCATCGG3 660 

TG3CTCCTG0 TGATGGAGAA 2AGGAAAGAC 72 0 

TTAGAAAT2A GJAAGTTTOG SATGGGGTTG 780 

TA2CTG GCTG T2AT2GAAGG TGC IAAATTC 340 

TGGAAGGAGG TTTCCCACGA SGA2CTGGAG 90C 

GGGCCACCGA AACGAATCCT GGASCCACAC 9 60 

963 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 320 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Met Glu Met. Glu Lvs Glu Phe Slu Gin lie Asp Lys Sor Sly Ser Trp 

1 : 10 15 

Ala Ala He Cyr Gin Asp lie Arg His Glu Ala Ser Asp Phe Pro Cys 

20 2S 30 

Arg Val Ala Lys Lou Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp 

35 40 45 

Val Ser Pro Phe As-o His Ser Arg He Lys Leu His Gin Glu Asp Asn 

5 0 5 5 6 0 

Asp Tyr He Asn Ala Ser Leu He Lys Met. Glu Glu Ala Gin Arq Ser 
6 r -0 H 6 0 

Tyr IH Leu Thi Gin uiv ?r-~> Leu Pro Asn Thr Cys Gly H l o Pne Trp> 
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Slu Me: 

7a 1 Met 

Glu Glu 
13 C 
lie Ser 
145 

Glu Asn 

Thr Thr 

Asn Phe 

Glv Pro 
21C 
Phe Cys 
225 

Pro Ser 
Arg Me: 
Ala Val 

Asp Gin 

29 0 
His lie 
3 05 



YH Trp Glu Gin Lys Ser Arg Gly Val Val 

100 105 
Hu Lys Gly Leu Lys Cys Ala Gin Tyr 

115 ' ' 120 

Lys Glu Me: He Phe Glu Asp Thr Asn Leu 
13 5 14 0 

Glu Asp lie Lys Ser Tyr Tyr Thr Val Arg 

150 155 
Leu Thr Thr Gin Glu Thr Arg Glu lie Leu 

165 1*7 0 

Trp Pro Asp Phe Gly Val Pro Glu Ser Pro 

130 185 
Leu Phe Lys Val Arg Glu Ser 31 y Ser Leu 
195 ' 200 

Val Val Val His Cvs Ser Ala Gly He Gly 
2.5 220 
Leu Ala Asp Thr Cys Lou Leu Leu Met Asp 

230 235 
Ser Val Asr Tie Lys Lys Val Leu Leu Glu 

24 5 250 
Gly Leu He Gin Thr Ala Asp Gin Leu Arg 

260 265 
He Glu Gly Ala Lys Phe He Met Gly Asp 
275 ' * 280 

Trp Lys Glu Leu Ser His Glu Asp Leu Glu 

295 300 
Pro Pro Pro Pro Ara Pro Pro Lys Arg He 
310 315 



Me: Leu Asn 

110 

Trp Pro Gin 
125 

Lys Leu Thr 

Gin Leu Glu 

His Phe His 

i n c 

X i -> 

Ala Ser Phe 
190 

Ser Pro Glu 
205 

Arg Ser Gly 
Lys Arg Lys 
Met Arg Lys 

i c; c, 

Phe Ser Tyr 

270 
Ser Ser Val 
285 

Pro Pro Pro 
Leu Glu Pro 



Arg 

Lys 

Leu 

Leu 
160 
Tyr 

Leu 

H is 

Thr 

Asp 
24 0 
Pne 

Leu 

Gin 

Glu 

His 

320 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1669 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 



GAAACAAGCA 


CTGGATTCCA 


TATCCCACTG 


CC.AAAACCGC 


ATGGTTCAGA 


TTATCGCTAT 


60 


TGCAGCTTTC 


ATCATAATAC 


AOACCTTTGC 


TGCCGAAACG 


AAGCCAGACA 


ACAGATTTCC 


120 


ATCAGCAGGA 


TGTGGGGGCT 


CAAGGTTGTG 


CTGCTACCTG 


TGGTGAGCTT 


TGCTCTGTAC 


180 


CCTGAGGAGA 


TACTGGACAC 


CCACTGGGAG 


CTATGGAAGA 


AG AC C CAC AG 


GAAGCAATAT 


240 


AACAACAAGG 


TGGATGAAAT 


CTCTCGGCGT 


TTAATTTGGG 


AAAAAAACCT 


GAAGTATATT 


3 00 


TCCATCCATA 


ACCTTGAGGC 




GTCCATACAT 


ATGAACTGGC 


TATGAACCAC 


360 


CTGGGGGACA 


TGACCAGTGA 


A k Jrt-j-j l^'j 1 i 


u AG A-rtGATGrt 


CTSGACTCAA 


AGTACCCCTG 


420 


f J 1 i"' 'pr^ ^ynT 1 ' " f— '~* 


GCAGTAATGA 


CAC 2 0TTTAT 


ATCCCAGAAT 


GGGAAGGTAG 


AGS 2CCAGAC 


480 


TCTGTCGACT 


ATCGAAAGAA 


AGGATATGTT 


ACTCCTGTCA 


AAAATC AGGG 


TCA STGTGGT 


540 


TCCTGTTGGG 


CTTTTAGCTC 


TGTGGGTGCC 


CTGGAGGGCC 


AACTCAAGAA 


gaaaactgg: 


600 


AAACTCTTAA 


ATCTGAGTCC 


CCASAACCTA 


GTGGATTGTG 


TGTCTGAGAA 


TGATGGCTGT 


6 0 






a T - . AA T AT -'TV- - Aon Ao."..-^ - 



































CGTCTCTGTG 




CAAGCCTGAC 


ZZ'C CTTCC AG 














GAAGCATGGG 


J 6 fi 








(00 A A A 1 '" A AGO 


*^ Cj Cjy f \ T 


T AAAAA 2 A G G 










/A i — ^ i — rtT*-~rV_5 






1 0 3 0 








A^A _^ n 1 o a ^ n 




TAAATCCAT I' 






ATTTCTTSCA 


'"GATGGTSP A 


j * /-\AC _j »aT 


GG A CTTTGGA 




; 2 00 


T^TOCTAITT 


TTGAAGCAGA 


TGTGGTGATA 


CT3AGATTGT 


ctgttcagtt 




12 6 0 


rpr-irp. ■~^ rr " l [~^ ""'"IT 


AAATGATC C T 


TCCTA 2TTTG 


CTTCTCTCCA 


g:gatgacct 


TTTTCACTGT 


: 2 o 


ggc iatvags 


A'T'T ~TG 


a :ag :tgtgt 


ACTCTTAGGC 


TAAGAGATGT 


GACTACAGCC 


1330 


tgcccctgas 


TGTGTTGTCC 


CAGGGCTGAT 


GCTGTACAGG 


TA _AG'juT.j^ 


AGATTTTCAC 


14 40 


/iTA'jGTTA'J.t 


ttitcattsa 


:gggactagt 


TAGGTTTAAG 


r* ^ ■» , > ^ 


GACTAGGGTA 


1500 


ATCTGACTCV 


TCACTTC2TA 


agtt:ccttc 


TATATCCTCA 


AG G TAG AAA T 


GTCTATGTTT 


1560 


TCTACTC C AA 


TT: ATAAATC 


tattcataag 


TCTTTGGTAC 


AAGTTTACAT 


GATAAAAAGA 


1620 


AATGTGATTT 


GTCTTCCCTT 


CTTTGCACTT 


TTGAAATAAA 


GTATTTATG 




16 6 9 




(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 329 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met. Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser ?he Ala Leu 

1 5 10 IB 

Tyr Pro Glu Glu He Leu Asp Thr His Trp Clu Leu Trp Lys Lys Thr 

2 0 2 5 3 0 

His Arg Lys Gin Tyr Asr. Asn Lys Val Asp Glu He Ser Arc Arg Leu 

3 b 4 0 4 5 

He Trp Glu Lys Asn Leu Lys Tyr He Ser lie His Asn Leu Glu Ala 

50 55 60 

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp 
65 70 ^5 80 

Mot Thr Ser Glu Glu Val Val Gin Lys Mot. Thr Gly Leu Lys Val Pro 

8 5 9 0 9 E. 

Lou Ser His Ser Arg Ser Asn Asp Thr Leu Tyr 11= Pro Clu Trp Glu 

ioo 105 in 

Gly Aru Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyi Val Thr 
115 120 12^ 

Pro Val Lys Asn Gin Gly Gin Cys Gly Ser ' ys Trp A.a Ph~ Ser Ser 

i ~ - : } c 1 ' ■- 
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7a 1 


Giy 


Aia 


Lou 


G i u 


Giy 


Gin 


Lev: 


Lys 


Lys 


I uc 

'-'I ° 


Thr 


Giy 


Lys 


Leu 


Leu 


145 










1 5 0 










C C 
^ J -> 










1 k 0 


Asn 


[,f':j 


S e r 


Pro 


J : - 

1 cz 


Asn 


Leu 


V a 1 


Asp 


Cys 
1 7 u 


•J d i 


Ser 


G 1 u 


Asn 


Ar p 
1 ~5 


G 1 v 


Cys 


Giy 


Giy 


Giy 


Tyr 


Met 


Thr 


Asn 


Ala 


Phe 


\j ^ n 


is f r 

i v I 


Val 


Gin 


Lys 


As n 






180 










lo J 










19 0 






Arg 


Giy 


lie 


Asp 


o e i 


Giu 


Asp 


Ala 


TV r 


Pro 


1 /I 


Val 


Glv 


Gin 


Giu 


Giu 






19 5 








200 










9 0 ^ 








Ser 


Cys 


M r ' t 


Tyr 


Asn 


Pro 


Thr 


Giy 


Lys 


Ala 


Ala 


Lys 


Cys 


Arg 


Giy 


Tyr 




210 










215 










Tin 

z. U 




r\± ex 


Va 1 


Ala 

240 


Arg 

2 2 5 


Giu 


lie 


Pro 


Giu 


Sly 


Asn 


Giu 


Lys 


Ala 


Leu 

235 


Lys 


Aro 


Arg 


Va 1 


— . i , . 
■j 1 


Pro 


v n ^ 




Val 


Ala 


lie 


Asp 


Ala 


Ser 


Leu 


Thr 


Ser 


Phe 








245 










250 










255 




Gin 


Phe 


Tyr 


Ser 


Lys 


Giy 


Val 


Tyr 


Tyr 


Asp 


Giu 


Ser 


Cys 


Asn 


Ser 


Asp 






260 










255 










27 0 




Giy 


Asn 


Leu 


Asn 


His 


Aia 


V a ^ 


Leu 


Ala 


Val 


Giy 


Tyr 


Giy 


lie 


Gin 


Lys 




27 5 










280 










285 








Asn 


Lys 

2 9 0 


H: s 


Trp 


T ". o 


lie 


Lys 
295 


Asn 


Ser 


Trp 


G ly 


Giu 

300 


Asn 


Trp 


Giy 


Asn 


Lys 
305 


Giy 




Tie 


Leu 


Met 


Aia 


Arg 


Asn 


Lys 


Asn 


As:: 


Ala 


7y s 


G 1 y 


He 








310 










315 










3 20 


Ala 


Asr. 


Leu 


Ala 


r -i 


Phe 


Pro 


Lys 


Met 

















(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1001 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 



60 



CTGCAGGAAT TCGGCACGAG GGGTGCTATT GTGAGGCGGT TGTAGAAGTT AATAAAGGTA 
TCCATGGAGA ACACT3AAAA CTCAGTGGAT TCAAAATCCA TTAAAAATTT GGAACCAAAG 
ATCATACATG GAAGCGAAT Z AATGGACTCT GGAATATCCC TGGACAACAG TTATAAAATG 
GATTATCCTG AGATGGGTTT ATGTATAATA ATTAATAATA AGAATTTTCA TAAGAGCACT 
GGAATGACAT CTCGGTCTGG TACAGATGTC GATGCAGCAA ACCTCAGGGA AACATTCAGA 
AACTTGAAAT ATG AAGTCAG GAATAAAAAT GATCTTACAC GTGAAGAAAT TGTGGAATTG 3 60 
ATGCGTGATG TTTCTAAAGA AGATCACAGC AAAAGGAGCA GTTTTGTTTG TGTGCTTCTG 
AGCCATGGTC AAGAAGGAAT AATTTTTGGA ACAAATGGAC CTGTTCACCT GAAAAAAATA 

acaaactttt tcagag:,gga TCGTTGTAGA AGTCTAACTG gaaaacccaa acttttcatt 

ATTCAGGCCT GCCGTGGTAC AGAACTGGAC TGTGGCATTG AGACAGACAG TGGTGTTGAT o0 0 
GATGACATGG CGTGTCATAA AATACCAGTG GAGGCCGACT TCTTGTATGC ATACTCCACA 
GCACCTGSTT ATTATT-TG GCGAAATTCA AAGGATGGCT CCTG 3TT2AT CCAGTCGCTT 
TGTGCCATG.: TGAAACAGTA TGCCGACAAG CTTGAATTTA TGCACATTCT TACCCSGGTT 



180 
240 

3 00 



420 
480 
54 0 



560 
720 
7 80 
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'\A/\'_ A -iAT . CATGTATTGT CH LATG'. T _ ACAAAAsjAA^. TCTATTTTTA T . .-. ... * .iA,'!'-.!.-. ' '. l 

viTu^TT^ ->T T'^oT^j'jT'TT'T TTTTA^TCT^ T ATG V_ AA i _>T i o AG AA i jATG^> i.n.nT^ i '.'>j • 

\ctgtattt: cctctlattt tgacctactc tcatgctgsa g 10 01 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 277 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Mot Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser He Lys Asn Leu 

1 E 10 1 E 

Glu Pre Lys lie He His Gly Ser Glu Ser Met Asp Ser Gly He Ser 

2 0 2 5 3 0 

Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys He 

3 E 4 C 4 5 

He Ho Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg 

50 " 55 60 

Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn 

65 7 0 7 5 8 0 

Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu He 

35 90 95 

Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser 

100 105 110 

Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly He He Phe 

115 120 * 125 

Gly Thr Asn Gly Pro Val Asp Leu Lys Lys lie Thr Asn Phe Phe Arg 

n: 135 140 

Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Pne He He 
145 150 155 160 

Gin Ala Cys Arg Gly Thr Glu Leu Asp Cys Gly He Glu Thr Asp Ser 

165 " 170 175 

Gly Val Asp Asp Asp Met Ala Cys His Lys He Pro Val Glu Ala Asp 

180 185 190 

Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn 

19 5 200 2C5 

Oei Lys Asp Gly Sei Trp Phe He Gin Ser Leu Cys Ala Met Leu Lys 

21; 215 220 

Glii Tyr Ala Asp Lys Leu Glu Phe Met His He Leu Thr Arg Val Asn 
225 230 235 240 

Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe 

245 250 255 

H:s Ala Lys Lys Gin He Pro Cys He Val Ser Me" Leu Thr Lys Glu 

260 265 270 

Leu Tyr Phe Tyr H is 
(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 963 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

ATGGAGATGG AAAAGGAGTT CGAGCAGATC GACAAGTCCG GGAGCTGGGC GGCCATTTAC 60 
CAGGATATO:: GACATGAACC CAGTGAOTTC CCATGTAGAG TGGCCAAGCT TCCTAAGAAC 120 
AAAAACCGAA A7AGGTACAG AGACGTCAGT CCCTTTGACC ATAGTCGGAT TAAACTACAT 180 
CAAGAAGATA ATGACTATAT OAACGCTAGT TTGATAAAAA TGGAAGAAGC CCAAAGGAGT 
TACATTCTTA CO 2AGGGCCC TTTGCCTAAC ACATGCGGTC ACTTTTGGGA GATGGTGTGG 
GAGCAGAAAA G :AGGGGTGT OGTCATGCTC AACAGAGTGA TGGAGAAAGG TTCGTTAAAA 
TGOGCAC.AAT A7TGG2CACA AAAAGAAGAA AAAGAGATGA TCTTTGAAGA CACAAATTTG 4.0 
AAATTAACAT TGATCTCTGA AGATATCAAG TCATATTATA CAGTGCGACA GCTAGAATTG 48 0 
gaaaacctta CAACSCAAGA AACTCGAGAG ATCTTACATT TCCACTATAC CACATGGCCT 
GACTTTG jAG TCCCTGAATC ACCAGCCTCA TTCTTGAACT TTCTTTTCAA AGTCCGAGAG 
TOAGGGTOAO TOAGCCOGGA GCACGGSCCC GTTGTGGTGC ACAGCAGTGC AGGCATCGGC 
AGGT "TC3 AA C~TTrTOTCT GG2T3ATACC TGCCTOCTGO TGATGGACAA GAGGAAAGAO 
CCTTCTTCCG TT--ATATOAA GAAAGTGCTG TTAGAAATGA GGAAGTTTCG GATGGGGTTG 
atccaga:ag ccg accagct GCGCTTCTCC TACCTGGCTG TGATCGAAGG TGC2AAATTC 
atcatgg^gs act:ttccgt GCAGGATCAG TGGAAGGAGC TTTCOCACGA GGACCTGGAG 
CCCCCAHCO AGCATATCCC CCCACCTCCC CGGOCACCCA AACGAATCCT GGAGCCACA7 
TGA 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 322 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Me- Glu N-t Glu Lvs Glu Phe Glu Gin He Asp Lys Ser Gly Ser Trp 

Ala Ala He TVr Gin Asp He Arg His Glu Ala Ser Asp Phe Pro Cys 

o 2 5 ^ 0 

Ara V.l Ala Lvs Leu Pro Lvs Asn Lys Asn Arg Asn Arg Tyr Arg Asp 
-. : d n 4 5 



240 

300 
3 60 



540 

60 0 
660 
720 
780 
840 
900 
960 
963 
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Phe Aso His 



La i U 

lie 
145 
Glu 

Thr 



Me t 

Met 

Glu 
130 
Ser 

Asn 

Thr 



Glu 
115 
Lys 

Glu 



31y 
Sor 
Arg 



Pro 
305 
Glu 



Pre 
2 1 0 
Gly 

Lys 

Lys 

Tyr 

Va 1 
29 0 
Pro 

Pro 



L6U 

Va 1 

Thr 

Asp 

Phe 

Leu 
275 



Trp Glu Gin 

10 0 

Lys Gly Ser 

Glu Met lie 

Asp lie Lys 
150 

Thr Thr Gin 
165 

Pro Asp Fhe 
ISO 

Pho Lys Val 

Vai Val His 

Phe Cys Leu 
230 

Pro Ser Ser 
245 

Arg Mez Gly 

260 

Ala Val He 



Val 



r'ne 
135 
Ser 

Glu 

GlV 
Arg 

Ser 
21 1 
Ala 



Gin Asp Gin Trp 

Glu His lie Pro 
310 



Glu 



105 

Lys Cys Ala 

120 

Glu Asp Thr 
Tyr Tyr Thr 

Thr Arg Hu 

170 

Val Pro Glu 

135 

Glu Ser Gly 
200 

Ser Ala Gly 
Asp Thr Cys 

Asp He Lys 

25 0 

He Gin Thr 

265 

Gly Ala Lys 
280 

Glu Leu Ser 



Hie Gin Glu 

Glu Ala Gin 

Cys Gly His 

Val Met Leu 
11C 

Gin Tyr Trp Pro 

Asn Leu Lys Leu 
14 0 

Val Arg Gin Leu 

1 5 d 

He Leu His Phe 



'rp 



Thr Leu 



Pro Pro Pro Arg 



Ser 

Se r 

He 

Leu 
23 5 
Lys 

Ala 

Phe 

His 

Pro 
315 



Pro Ala Ser 
190 

Leu Ser Pro 
20 5 

Gly Thr Cys 
220 

Leu Leu Met 
Val Leu Leu 

Asp Gin Leu 

270 

He Met Gly 
235 

Glu Asp Leu 

300 

Pro Lvs Ara 



His 
175 
Phe 

Glu 

Gly 

Asp 

Glu 
255 
Arg 

Asp 

Glu 

He 



Leu 
160 
Ty r 

Leu 

His 

Arg 

Lys 
24 0 
Met 

Phe 



ij eu 
320 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1001 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 



CTGCAGGAAT 


TCGGCACGAG 


GGGTGCTATT 


GTGAGGCGGT 


TGTAGAAGTT 


AATAAAGGTA 


60 


TCCATGGAGA 


ACACTGAAAA 


CTCAGTGGAT 


TCAAAATCCA 


TTAAAAATTT 


GGAACCAAAG 


no 


ATCATACATG 


GAAGCGAATC 


AATGGACTCT 


GGAATATCCC 


TGGACAACAG 


TTATAAAATG 


180 


GATTATCCTG 


AGATGGGTTT 


ATGTATAATA 


ATTAATAATA 


AGAATTTTGA 


TAAGAGCACT 


24 0 


GGAATGACAT 




TACAGATGTC 


GATGC AG C AA 


ACCTCAGGGA 


AACATTCAGA 


•: 0 0 


AA C T 1 1 G AAA T 


ATGAAGTCAG 


G AA T AAAA A T 


G A T C TT A C A C 


GTGAAGAAAT 


TGTGGAATTG 


;6C 
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ATGCGTGATG 


TTTCTAAAGA 


AGATCACAGC 










AGCCATGGTG 


AAGAAGGAAT 


AAT I ^ Tl L>'jiH 


AO AAA 1 (jbAL 


lTGTT -ukC C i 


G AAA AAA AT A 


180 


ACAAACTTTT 


TCAGAGGGGA 


i L G i Ibi A 1 ■.-> A 


» -'T'r'T'Ti 7a ■'"'TP 
A j1 - 1 AA>~ i^j 




ACTTTTCATT 


-40 


ATTCAGGCCT 


(_. - .'j> 1 'j^ji I AL 


A'jAAL i _Aj/V_ 


™G TG 3 r A TTG 


AGA r AGA'~AG 


TGGTGTTGAT 


COO 


GATGACATGG 


CGTGTCATAA 


AATACCAGTG 


GAGGCCGACT 


TCTTGTATGC 


ATACTCCACA 


660 


GCACCTGGTT 


ATTATTCTTG 


GCGAAATTCA 


AAGGATGGCT 


CCTGGTTCAT 


CCAGTCGCTT 


~20 


TGTGCCATGC 


TGAAACAGTA 


TGCCGACAAG 


CTTGAATTTA 


TGCACATTCT 


TACCCGGGTT 


780 


AACCGAAAGG 


TGGCAACAGA 


ATTTGAGTCC 


TTTTC2TTTG 


ACGCTACTTT 


TCATGCAAAG 


840 


AAACAGATTC 


CATGTATTGT 


TTCCATGCTC 


ACAAAAGAAC 


TCTATTTTTA 


TCACTAAAGA 


900 


AATGGTTGGT 


TGGTGGTTTT 


TTTTAGTTTG 


TATOC 2AAGT 


GAGAAGATGG 


TATATTTGGT 


960 


ACTGTATTTC 


C ' ^ T C ^""C A TT T 


TGACCTACTC 


TCATGCTGCA 


(3 




1001 




(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 277 amino acids 
(B) TYPE: amino acid 
(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



Met 


Glu 


Asn 


Thr 


Glu 


Asn 


Ser 


Val 


Asp 


Ser 
10 


Lys 


Ser 


He 


Lys 


Asn 

15 


Leu 


Glu 


Pro 


Lys 


lie 


He 


His 


Gly 


Set- 


Glu 


Ser 


Met 


Asp 


Ser 


Gly 


He 


Ser 








2 0 










25 










3 0 






Leu 


Asp 


Asn 


Ser 


Tyr 


Lys 


Met 


Asp 


Tyr 


Pro 


Glu 


Met 


Gly 


Leu 


Cys 


He 




35 










40 










45 








lie 


He 


Asn 


Asn 


Lys 


Asn 


Phe 


His 


Lys 


Ser 


Thr 


Gly 


Met 


Thr 


Ser 


Arg 




5C 










55 










60 










Ser 


Gly 


Thr 


Asp 


Val 


Asp 


Ala 


Ala 


Asn 


Leu 


Arg 


Glu 


Thr 


Phe 


Arg 


Asn 


65 










70 










75 










80 


Leu 


Lys 


Tyr 


Glu 


Val 


Arg 


Asn 


Lys 


Asn 


Asp 


Leu 


Thr 


Arg 


Glu 


Glu 


He 










3 5 










90 










95 




Val 


Glu 


Leu 


Met 


Arg 


Asp 


Val 


Ser 


Lys 


Glu 


Asp> 


His 


Ser 


Lys 


Arg 


Ser 








100 








105 










110 






Ser 


Phe 


Val 


Cy s 


Val 


Leu 


Leu 


Ser 


His 


Gly 


Glu 


Glu 


Hy 


He 


lie 


Phe 






115 










120 










125 








Gly 


Thr 


Asn 


Gly 


Pro 


Val 


Asp 


Leu 


Lys 


Lys 


He 


Thr 


Asn 


Phe 


Phe 


Arg 


130 










135 










140 










Gly 


Asp 


Arg 


Cy s 


Arg 


Ser 


Leu 


Thr 


Gly 


Ly s 


Pro 


Lys 


Leu 


Phe 


He 


He 


145 








150 










155 










160 


Gin 


Ala 


Ser 


Arg 


Gly 


Thr 


Glu 


Leu 


Asp 




Gly 


He 


Glu 


Thr 


Asp 


Ser 










165 










17 0 










17 5 




Gly 


Val 


Asp 


Asp 


Asp 


Met 


Ala 


Cys 


His 


Lys 


lie 


Pro 


Val 


Glu 


Ala 


Asp 






13 0 










185 










190 






Phe 


Leu 


Tyr 


Ala 


Tyr 


S e r 


Thr 


Ala 


Pro 


Gly 


Tyr 


Tyr 


Ser 


Trp 


Arg 


Asn 






195 










200 










205 








Ser 


Lys 


Asp 


Gly 


Ser 


Trp 


Phe 


He 


Gin 




Leu 




Ala 


Met 


Leu 


Lys 



2 IS 220 
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■2 1NF0FMATI0N FOR SEQ ID NO : 1 I : 

; I ' SEQUENCE CHARACTER I ST ICS : 
i A? LENGTH : 2 7 7 ammo acids 
(3> TYPE: air. i no acid 
kZ! STRAKDEDNESS: single 
;D< TOPOLOGY: linear 

■:; MOLECULE TYPE: peptide 

SEQUENCE DESCRIPTION: SEQ ID NO : 1 1 ; 

Met Glu Asn Thr Glu Asn Ser Val Asp Sor Lys Ser He Lys Asn Leu 

1 5 10 IS 

Glu Pro Lys Ho He His Gly Ser Glu Ser Met Asp Ser Gly lie Ser 

2 0 2 5 3 0 

Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys lie 

25 40 4S 

He He Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arc 

SO ^ 5S 6 0 

Ser Glv Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn 
6S '70 75 80 

Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu He 

85 " 90 9 5 

Val Glu Leu Met Ara Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser 

100 " 10S 110 

Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly He He Phe 

HS " 120 125 

Gly Thr Asn Gly Pro Val Asp Leu Lys Lys lie Thr Asn Phe Phe Arg 

130 135 140 

Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe He He 
145 ISO 155 160 

Gin Ala Ser Arg Gly Thr Glu Leu Asp Cys Gly He Glu Thr Asp Ser 

165 170 175 

Gly Val Asp Asp Asp Met Ala Cys His Lys He Pro Val Glu Ala Asp 

180 185 19 0 

Phe Leu Tyr A. a Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn 

195 200 205 

Ser Lys Asp Giy Ser Trp Phe He Gin Ser Leu Cys Ala Met Leu Lys 

210 215 22 0 

Gin Tyr Ala Asp Lvs Leu Glu Phe Met His He Lou Thr Arg Val Asn 
225 ' * ' 230 235 240 

Ara Lvs Val Ala Thr Glu Phe Glu Ser Pho Ser Phe Asp Ala Thr Phe 

245 250 255 

His Ala Lys Lys Gin He Pro Cys He Val Ser Met Leu Thr Lys Glu 

260 265 270 

Leu Tyr Phe Tyr His 
(2) INFORMATION FOR SEQ ID NO:12:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 990 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

ATGTGGGGGC TCAAGGTTCT GCTGCTACCT GTGGTGAGCT TTGCTCTGTA CCCTGAGGAG 60
ATACTGGACA CCCACTGGGA GCTATGGAAG AAGACCCACA GGAAGCAATA TAACAACAAG 120
GTGGATGAAA TCTCTCGGCG TTTAATTTGG GAAAAAAACC TGAAGTATAT TTCCATCCAT 180
AACCTTGAGG CTTCTCTTGG TGTCCATACA TATGAACTGG CTATGAACCA CCTGGGGGAC 240
ATGACCAGTG AAGAGGTGGT TCAGAAGATG ACTGGACTCA AAGTACCCCT GTCTCATTCC 300
CGCAGTAATG ACACCCTTTA TATCCCAGAA TGGGAAGGTA GAGCCCCAGA CTCTGTCGAC 360
TATCGAAAGA AAGGATATGT TACTCCTGTC AAAAATCAGG GTCAGTGTGG TTCCTCTTGG 420
GCTTTTAGCT CTGTGGGTGC CCTGGAGGGC CAACTCAAGA AGAAAACTGG CAAACTCTTA 480
AATCTGAGTC CCCAGAACCT AGTGGATTGT GTGTCTGAGA ATGATGGCTG TGGAGGGGGC 540
TACATGACCA ATGCCTTCCA ATATGTGCAG AAGAACCGGG GTATTGACTC TGAAGATGCC 600
TACCCATATG TGGGACAGGA AGAGAGTTGT ATGTACAACC CAACAGGCAA GGCAGCTAAA 660
TGCAGAGGGT ACAGAGAGAT CCCCGAGGGG AATGAGAAAG CCCTGAAGAG GGCAGTGGCC 720
CGAGTGGGAC CTGTCTCTGT GGCTATTGAT GCAAGCCTGA CCTCCTTCCA GTTTTACAGC 780
AAAGGTGTGT ATTATGATGA AAGCTGCAAT AGCGATAATC TGAACCATGC GGTTTTGGCA 840
GTGGGATATG GAATCCAGAA GGGAAACAAG CACTGGATAA TTAAAAACAG CTGGGGAGAA 900
AACTGGGGAA ACAAAGGATA TATCCTCATG GCTCGAAATA AGAACAACGC CTGTGGCATT 960
GCCAACCTGG CCAGCTTCCC CAAGATGTGA 990

(2) INFORMATION FOR SEQ ID NO:13:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 990 base pairs
          (B) TYPE: nucleic acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: cDNA

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

[nucleotides 1-360 illegible]
TATCGAAAGA [illegible] [illegible] AAAAATCAGG [illegible] TTCCGCTTGG 420
GCTTTTAGCT [illegible] [illegible] [illegible] [illegible] CAAACTCTTA 480
[illegible] [illegible] AGTGGATTGT GTGTCTGAGA ATGATGGCTG TGGAGGGGGC 540
TACATGACCA ATGCCTTCCA [illegible] AAGAACCGGG GTATTGACTC [illegible] 600
TACCCATATG TGGGACAGGA AGAGAGTTGT ATGTACAACC CAACAGGCAA GGCAGCTAAA 660
TGCAGAGGGT ACAGAGAGAT CCCGGAGGGG AATGAGAAAG CCCTGAAGAG GGCAGTGGCG 720
CGAGTGGGAC CTGTCTCTGT GGCGATTGAT [illegible] [illegible] GTTTTACAGC 780
AAAGGTGTGT ATTATGATGA AAGCTGCAAT AGCGATAATC TGAACCATGC GGTTTTGGCA 840
GTGGGATATG GAATCCAGAA GGGAAACAAG CACTGGATAA TTAAAAACAG CTGGGGAGAA 900
AACTGGGGAA ACAAAGGATA TATCCTCATG GCTCGAAATA AGAACAACGC CTGTGGCATT 960
GCCAACCTGG CCAGCTTCCC CAAGATGTGA 990



(2) INFORMATION FOR SEQ ID NO:14:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 329 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15

Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30

His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45

Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80

Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95

Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110

Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125

Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Ser Trp Ala Phe Ser Ser
    130                 135                 140

Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160

Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175

Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190

Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205

Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220

Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240

Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255

Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270

Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285

Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300

Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320

Ala Asn Leu Ala Ser Phe Pro Lys Met
                325
















(2) INFORMATION FOR SEQ ID NO:15:

     (i) SEQUENCE CHARACTERISTICS:
          (A) LENGTH: 329 amino acids
          (B) TYPE: amino acid
          (C) STRANDEDNESS: single
          (D) TOPOLOGY: linear

    (ii) MOLECULE TYPE: peptide

    (xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15

Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30

His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45

Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80

Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95

Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110

Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125

Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Ala Trp Ala Phe Ser Ser
    130                 135                 140

Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160

Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175

Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190

Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205

Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220

Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240

Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255

Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270

Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285

Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300

Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320

Ala Asn Leu Ala Ser Phe Pro Lys Met
                325
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WHAT IS CLAIMED: 

1. A process for determining the binding ability of a
ligand to a cysteine-containing wild-type enzyme comprising the
steps of:

(a) contacting a complex with the ligand, the complex
comprising a mutant form of the wild-type enzyme,
in which cysteine, at the active site, is replaced
with serine, in the presence of a known binding
agent for the mutant enzyme, wherein the binding
agent is capable of binding with the mutant
enzyme to produce a measurable signal.

2. The process of Claim 1 further comprising the
step of contacting the complex with the binding agent, in the absence
of the ligand, to produce a first measurable signal.

3. The process of Claim 1 wherein the signal is a
colorimetric, photometric, spectrophotometric or radioactive signal.

4. The process of Claim 3 wherein the signal is a beta
radiation-induced scintillation.

5. The process of Claim 1 wherein the known
binding agent is an inhibitor for the wild-type enzyme and contains a
radionuclide to induce scintillation upon contact with the mutant
enzyme.

6. The process of Claim 1 wherein the complex
further comprises a solid support, a scintillation agent, and a fused
enzyme linking construct.

7. The process of Claim 6 wherein the complex is
further comprised of:

(a) a fluopolymer bead containing a scintillation agent
and Protein A, which is attached via Protein A to:

(b) an anti-GST antibody, which is further attached to
the GST end of:

(c) a fused enzyme linking construct comprised of
GST enzyme fused with the mutant enzyme.

8. The process of Claim 1 wherein the wild-type
enzyme is selected from the group consisting of proteases,
phosphatases, lipases, hydrolases and kinases.

9. The process of Claim 8 wherein the wild-type
enzyme is selected from the group consisting of tyrosine
phosphatases and cysteine proteases.

10. The process of Claim 9 wherein the tyrosine
phosphatase is selected from the group consisting of PTP1B, LCA,
LAR, DLAR and DPTP.

11. The process of Claim 11 wherein the tyrosine
phosphatase is PTP1B which contains serine in place of cysteine at
position 215.

12. The process of Claim 11 wherein the PTP1B
phosphatase is present in a truncated form comprising amino acids
1-320 and containing the active binding site.

13. The process of Claim 9 wherein the cysteine
protease is a Cathepsin or capsase.

14. The process of Claim 13 wherein the cathepsin is
selected from the group consisting of Cathepsin B, Cathepsin G,
Cathepsin J, Cathepsin K(02), Cathepsin L, Cathepsin M and
Cathepsin S.

15. The process of Claim 14 wherein the cathepsin is
Cathepsin K(02).

16. The process of Claim 11 wherein the capsase is
selected from the group consisting of: capsase-1 (ICE), capsase-2
(ICH-1), capsase-3 (CPP32, human apopain, Yama), capsase-4
(ICErel-II, TX, ICH-2), capsase-5 (ICErel-III, TY), capsase-6 (Mch2),
capsase-7 (Mch3, ICE-LAP3, CMH-1), capsase-8 (FLICE, MACH,
Mch5), capsase-9 (ICE-LAP6, Mch6) and capsase-10 (Mch4).

17. The process of Claim 16 wherein the capsase is
human apopain CPP32.

18. The process of Claim 11 wherein the tyrosine phosphatase
is PTP1B and the binding agent is a peptide selected from the group consisting
of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where
E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide.

19. The process of Claim 18 wherein the peptide is in tritiated
form.

20. The process of Claim 18 wherein the peptide is tritiated N-
(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-
phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide, being
tritiated Bz-NEJJ-CONH2, wherein E as used herein is glutamic acid and J, as
used herein, is the (F2Pmp) moiety, (4-phosphono-(difluoromethyl)phenylalanyl).



21. A process for determining the binding ability of a
ligand to a cysteine-containing wild-type tyrosine phosphatase
comprising the steps of:

(a) contacting a complex with the ligand, the complex
comprising a mutant form of the wild-type enzyme,
the mutant enzyme being PTP1B, containing the
same amino acid sequence 1-320 as the wild type
enzyme, except at position 215, in which cysteine is
replaced with serine in the mutant enzyme, in the
presence of a known radioligand binding agent for
the mutant enzyme, wherein the binding agent is
capable of binding with the mutant enzyme to
produce a measurable beta radiation-induced
scintillation signal.

22. The process of Claim 21 further comprising before step (a),
the step of contacting the complex with the binding agent in the absence of
the ligand to produce a first measurable beta radiation-induced scintillation
signal.



23. The process of Claim 21 wherein the binding agent is a
peptide selected from the group consisting of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where
E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide.

24. The process of Claim 23 wherein the peptide is in tritiated
or I-125 iodinated form.

25. The process of Claim 24 wherein the peptide is tritiated N-
(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-
phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide, being
tritiated Bz-NEJJ-CONH2, wherein E as used herein is glutamic acid and J, as
used herein, is the (F2Pmp) moiety, (4-phosphono-
(difluoromethyl)phenylalanyl).

26. A complex comprised of:

(a) a mutant form of a wild-type enzyme, in which
cysteine, necessary for activity in the active site, is
replaced with serine and is attached to:

(b) a solid support.

27. The complex of Claim 26 further comprising: a binding
agent for the mutant enzyme, wherein the binding agent is capable of binding
with the mutant enzyme to produce a measurable signal.

28. A peptide selected from the group consisting of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where
E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide, for use as a binding agent for a
mutant enzyme.
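
The competitive readout recited in Claims 1 and 2 can be illustrated numerically. The following is a minimal sketch and not part of the claimed process: it assumes that the first signal (binding agent alone, Claim 2) and the signal obtained in the presence of the ligand (Claim 1) are available as scintillation counts, and that an optional background count is subtracted from both. The function names, the background term, the log-linear interpolation and the example counts are hypothetical and are chosen only for illustration.

```python
import math


def percent_displacement(signal_with_ligand, signal_without_ligand, background=0.0):
    """Percentage of the known binding agent displaced by the test ligand.

    signal_without_ligand is the first measurable signal (binding agent alone);
    signal_with_ligand is the signal measured with the ligand present.
    background is an assumed non-specific count subtracted from both.
    """
    window = signal_without_ligand - background
    if window <= 0:
        raise ValueError("signal without ligand must exceed background")
    bound_fraction = (signal_with_ligand - background) / window
    return 100.0 * (1.0 - bound_fraction)


def interpolate_ic50(concentrations, displacements):
    """Ligand concentration giving 50% displacement, by log-linear interpolation.

    Returns None when the data do not bracket 50%.
    """
    points = sorted(zip(concentrations, displacements))
    for (c_lo, d_lo), (c_hi, d_hi) in zip(points, points[1:]):
        if d_lo <= 50.0 <= d_hi and d_hi != d_lo:
            frac = (50.0 - d_lo) / (d_hi - d_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


if __name__ == "__main__":
    # Illustrative counts only: 10000 without ligand, 6000 with ligand, 2000 background.
    print(percent_displacement(6000, 10000, background=2000))
    # Illustrative dose-response points (concentration, % displacement).
    print(interpolate_ic50([1, 10, 100], [10, 40, 60]))
```

A ligand that competes with the known binding agent lowers the signal relative to the first measurable signal, so a larger percentage indicates stronger competition under these assumptions.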



WO 98/20156                                                  PCT/CA97/00825

[Drawing sheets. The sheets reproduce nucleotide sequences together with their deduced amino acid sequences; the sequence text is not legibly recoverable, and only the labels and opening residues noted below survive.]

FIG. 2A: nucleotide sequence with deduced amino acid sequence, residues 1-180, beginning Met Glu Met Glu Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp Ala Ala Ile Tyr.

[Unlabelled continuation sheet: residues 181 onward of the same sequence, ending Ile Leu Glu Pro His End.]

[Unlabelled sheets: cDNA, nucleotides 1-1669, with deduced amino acid sequence beginning Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu Tyr.]

[Unlabelled sheets: cDNA with deduced amino acid sequence beginning Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys.]
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